CLAIMS 

1. A text-processing method characterized by comprising the steps of:

generating a probability model in which information indicating which word of a text document belongs to which topic is made to correspond to a latent variable and each word of the text document is made to correspond to an observable variable;

outputting an initial value of a model parameter which defines the generated probability model;

estimating a model parameter corresponding to a text document as a processing target on the basis of the output initial value of the model parameter and the text document; and

segmenting the text document as the processing target for each topic on the basis of the estimated model parameter.

2. A text-processing method according to claim 1, characterized in that

the step of generating a probability model comprises the step of generating a plurality of probability models,

the step of outputting an initial value of the model parameter comprises the step of outputting an initial value of a model parameter for each of the plurality of probability models,

the step of estimating a model parameter comprises the step of estimating a model parameter for each of the plurality of probability models, and

the method further comprises the step of selecting a probability model, from the plurality of probability models, which is used to perform processing in the step of segmenting the text document, on the basis of the plurality of estimated model parameters.

3. A text-processing method according to claim 1, characterized in that the probability model is a hidden Markov model.

4. A text-processing method according to claim 3, characterized in that the hidden Markov model has a unidirectional structure.
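For illustration only, a unidirectional (left-to-right) structure can be expressed as a transition matrix in which each state either remains where it is or advances to the next state, and never returns to an earlier one. The following is a minimal sketch under that assumption; the function name `left_to_right_transitions` and the `stay_prob` parameter are hypothetical and do not appear in the claims.

```python
def left_to_right_transitions(num_states, stay_prob=0.9):
    """Build an initial transition matrix for a unidirectional HMM.

    Assumption (not taken from the claims): state i may only loop on
    itself or move to state i + 1, and the last state is absorbing.
    """
    if num_states < 1:
        raise ValueError("num_states must be at least 1")
    if not 0.0 < stay_prob < 1.0:
        raise ValueError("stay_prob must lie strictly between 0 and 1")
    matrix = [[0.0] * num_states for _ in range(num_states)]
    for i in range(num_states):
        if i == num_states - 1:
            # Final topic: nowhere left to advance to.
            matrix[i][i] = 1.0
        else:
            matrix[i][i] = stay_prob
            matrix[i][i + 1] = 1.0 - stay_prob
    return matrix
```

Each row sums to one, and every entry below the diagonal is zero, which is what keeps the topic sequence from moving backwards.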

5. A text-processing method according to claim 3, characterized in that the hidden Markov model is of a discrete output type.
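As a hedged illustration of how a discrete-output hidden Markov model could drive the segmenting step, the sketch below uses standard Viterbi decoding to assign each word to its most likely topic once the model parameters are known. The claims do not name a decoding algorithm, so Viterbi is an assumption here, as are the function name `segment_by_topic`, the dictionary layout of `emissions`, the probability floor for unseen words, and the convention that the document starts in topic 0.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    """Logarithm that maps a zero probability to negative infinity."""
    return math.log(p) if p > 0.0 else NEG_INF


def segment_by_topic(words, transitions, emissions):
    """Return the most likely topic (state) index for every word.

    transitions[i][j]: probability of moving from topic i to topic j.
    emissions[i]: dict mapping a word to its probability under topic i.
    Assumption: the document starts in topic 0.
    """
    if not words:
        return []
    n = len(transitions)
    floor = 1e-12  # assumed floor for words unseen under a topic

    def emit(i, w):
        return _log(emissions[i].get(w, floor))

    # score[i]: best log-probability of any path ending in topic i.
    score = [NEG_INF] * n
    score[0] = emit(0, words[0])
    back = []  # back[t][j]: best predecessor of topic j at word t + 1
    for w in words[1:]:
        new_score = [NEG_INF] * n
        pointers = [0] * n
        for j in range(n):
            best_i, best = 0, NEG_INF
            for i in range(n):
                cand = score[i] + _log(transitions[i][j])
                if cand > best:
                    best_i, best = i, cand
            pointers[j] = best_i
            new_score[j] = best + emit(j, w)
        back.append(pointers)
        score = new_score

    # Trace the best path backwards from the best final topic.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Positions where consecutive entries of the returned list differ can be read as topic boundaries.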

6. A text-processing method according to claim 1, characterized in that the step of estimating a model parameter comprises the step of estimating a model parameter by using one of maximum likelihood estimation and maximum a posteriori estimation.
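To make the difference between the two estimators concrete, the sketch below estimates the word-output probabilities of a single topic from directly observed word counts. This is a deliberate simplification: in a hidden Markov model the topic of each word is latent, so an iterative procedure would work with expected counts rather than observed ones. The symmetric Dirichlet prior, the `alpha` parameter, and both function names are assumptions introduced for illustration.

```python
from collections import Counter


def ml_estimate(words, vocabulary):
    """Maximum likelihood: relative frequency of each vocabulary word."""
    counts = Counter(words)
    total = sum(counts[w] for w in vocabulary)
    if total == 0:
        raise ValueError("no vocabulary words observed")
    return {w: counts[w] / total for w in vocabulary}


def map_estimate(words, vocabulary, alpha=2.0):
    """MAP estimate under a symmetric Dirichlet(alpha) prior.

    For alpha > 1 the posterior mode has the closed form
    (count + alpha - 1) / (total + V * (alpha - 1)),
    where V is the vocabulary size.
    """
    if alpha <= 1.0:
        raise ValueError("alpha must exceed 1 for this closed form")
    counts = Counter(words)
    total = sum(counts[w] for w in vocabulary)
    denom = total + len(vocabulary) * (alpha - 1.0)
    return {w: (counts[w] + alpha - 1.0) / denom for w in vocabulary}
```

The maximum likelihood estimate assigns probability zero to any word that was never observed, whereas the maximum a posteriori estimate keeps every probability strictly positive, which matters when only a single short document is available.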

7. A text-processing method according to claim 1, characterized in that

the step of outputting an initial value of a model parameter comprises the step of hypothesizing a distribution using the model parameter as a probability variable, and outputting an initial value of a hyper-parameter defining the distribution, and

the step of estimating a model parameter comprises the step of estimating a hyper-parameter corresponding to a text document as a processing target on the basis of the output initial value of the hyper-parameter and the text document.

8. A text-processing method according to claim 7, characterized in that the step of estimating a hyper-parameter comprises the step of estimating a hyper-parameter by using Bayes estimation.

9. A text-processing method according to claim 2, characterized in that the step of selecting a probability model comprises the step of selecting a probability model by using one of Akaike's information criterion, a minimum description length criterion, and a Bayes posteriori probability.
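As a rough illustration of two of the named criteria, the sketch below scores each candidate probability model from its maximized log-likelihood and its number of free parameters, then selects the candidate with the lowest score. The formulas are the common textbook forms: AIC as -2 log L + 2k, and a two-part minimum description length as -log L + (k/2) log n. The function names and the tuple layout of `candidates` are assumptions made for this example.

```python
import math


def aic(log_likelihood, num_params):
    """Akaike's information criterion (lower is better)."""
    return -2.0 * log_likelihood + 2.0 * num_params


def mdl(log_likelihood, num_params, num_observations):
    """Two-part minimum description length (lower is better)."""
    return -log_likelihood + 0.5 * num_params * math.log(num_observations)


def select_model(candidates, num_observations, criterion="aic"):
    """Return the name of the candidate with the lowest criterion value.

    candidates: list of (name, log_likelihood, num_params) tuples,
    one per estimated probability model (hypothetical layout).
    """
    if not candidates:
        raise ValueError("at least one candidate model is required")
    if criterion == "aic":
        def score(c):
            return aic(c[1], c[2])
    elif criterion == "mdl":
        def score(c):
            return mdl(c[1], c[2], num_observations)
    else:
        raise ValueError("criterion must be 'aic' or 'mdl'")
    return min(candidates, key=score)[0]
```

Both criteria penalize models with more parameters, so a model with more topics is selected only when its gain in likelihood outweighs the added complexity.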

10. A program for causing a computer to execute the steps of:

generating a probability model in which information indicating which word of a text document belongs to which topic is made to correspond to a latent variable and each word of the text document is made to correspond to an observable variable;

outputting an initial value of a model parameter which defines the generated probability model;

estimating a model parameter corresponding to a text document as a processing target on the basis of the output initial value of the model parameter and the text document; and

segmenting the text document as the processing target for each topic on the basis of the estimated model parameter.

11. A recording medium recording a program for causing a computer to execute the steps of:

generating a probability model in which information indicating which word of a text document belongs to which topic is made to correspond to a latent variable and each word of the text document is made to correspond to an observable variable;

outputting an initial value of a model parameter which defines the generated probability model;

estimating a model parameter corresponding to a text document as a processing target on the basis of the output initial value of the model parameter and the text document; and

segmenting the text document as the processing target for each topic on the basis of the estimated model parameter.

12. A text-processing device characterized by comprising:

temporary model generating means for generating a probability model in which information indicating which word of a text document belongs to which topic is made to correspond to a latent variable and each word of the text document is made to correspond to an observable variable;

model parameter initializing means for outputting an initial value of a model parameter which defines the probability model generated by said temporary model generating means;

model parameter estimating means for estimating a model parameter corresponding to a text document as a processing target on the basis of the initial value of the model parameter output from said model parameter initializing means and the text document; and

text segmentation result output means for segmenting the text document as the processing target for each topic on the basis of the model parameter estimated by said model parameter estimating means.

13. A text-processing device according to claim 12, characterized in that

said temporary model generating means comprises means for generating a plurality of probability models,

said model parameter initializing means comprises means for outputting an initial value of a model parameter for each of the plurality of probability models,

said model parameter estimating means comprises means for estimating a model parameter for each of the plurality of probability models, and

the device further comprises model selecting means for selecting a probability model, from the plurality of probability models, which is used to cause said text segmentation result output means to perform processing associated with the probability model, on the basis of the plurality of model parameters estimated by said model parameter estimating means.

14. A text-processing device according to claim 12, characterized in that the probability model is a hidden Markov model.

15. A text-processing device according to claim 14, characterized in that the hidden Markov model has a unidirectional structure.

16. A text-processing device according to claim 14, characterized in that the hidden Markov model is of a discrete output type.

17. A text-processing device according to claim 12, characterized in that said model parameter estimating means comprises means for estimating a model parameter by using one of maximum likelihood estimation and maximum a posteriori estimation.

18. A text-processing device according to claim 12, characterized in that

said model parameter initializing means comprises means for hypothesizing a distribution using the model parameter as a probability variable, and outputting an initial value of a hyper-parameter defining the distribution, and

said model parameter estimating means comprises means for estimating a hyper-parameter corresponding to a text document as a processing target on the basis of the output initial value of the hyper-parameter and the text document.

19. A text-processing device according to claim 18, characterized in that said model parameter estimating means comprises means for estimating a hyper-parameter by using Bayes estimation.

20. A text-processing device according to claim 13, characterized in that said model selecting means comprises means for selecting a probability model by using one of Akaike's information criterion, a minimum description length criterion, and a Bayes posteriori probability.